A Network Sampling: From Static to Streaming Graphs
نویسندگان
چکیده
Network sampling is integral to the analysis of social, information, and biological networks. Since many real-world networks are massive in size, continuously evolving, and/or distributed in nature, the network structure is often sampled in order to facilitate study. For these reasons, a more thorough and complete understanding of network sampling is critical to support the field of network science. In this paper, we outline a framework for the general problem of network sampling, by highlighting the different objectives, population and units of interest, and classes of network sampling methods. In addition, we propose a spectrum of computational models for network sampling methods, ranging from the traditionally studied model based on the assumption of a static domain to a more challenging model that is appropriate for streaming domains. We design a family of sampling methods based on the concept of graph induction that generalize across the full spectrum of computational models (from static to streaming) while efficiently preserving many of the topological properties of the input graphs. Furthermore, we demonstrate how traditional static sampling algorithms can be modified for graph streams for each of the three main classes of sampling methods: node, edge, and topology-based sampling. Experimental results indicate that our proposed family of sampling methods more accurately preserve the underlying properties of the graph in both static and streaming domains. Finally, we study the impact of network sampling algorithms on the parameter estimation and performance evaluation of relational classification algorithms.
منابع مشابه
ComPAS: Community Preserving Sampling for Streaming Graphs
In the era of big data, graph sampling is indispensable in many settings. Existing sampling methods are mostly designed for static graphs, and aim to preserve basic structural properties of the original graph (such as degree distribution, clustering coefficient etc.) in the sample. We argue that for any sampling method it is impossible to produce an universal representative sample which can pre...
متن کاملUsing an Evaluator Fixed Structure Learning Automata in Sampling of Social Networks
Social networks are streaming, diverse and include a wide range of edges so that continuously evolves over time and formed by the activities among users (such as tweets, emails, etc.), where each activity among its users, adds an edge to the network graph. Despite their popularities, the dynamicity and large size of most social networks make it difficult or impossible to study the entire networ...
متن کاملCan Sampling Preserve Application Adoption Process over OSN Graphs?
Can Sampling Preserve Application Adoption Process over OSN Graphs? Mohammad Rezaur Rahman, Chen-Nee Chuah {mrrahman, chuah}@ucdavis.edu Abstract Online social network (OSN)-based applications often rely on user interactions to propagate information or to recruit more users. Understanding the adoption or cascade process of an idea, a product, or a new application over OSN graph is of great inte...
متن کاملDesign and Test of the Real-time Text mining dashboard for Twitter
One of today's major research trends in the field of information systems is the discovery of implicit knowledge hidden in dataset that is currently being produced at high speed, large volumes and with a wide variety of formats. Data with such features is called big data. Extracting, processing, and visualizing the huge amount of data, today has become one of the concerns of data science scholar...
متن کاملSparsification Algorithm for Cut Problems on Semi-streaming Model
The emergence of social networks and other interaction networks have brought to fore the questions of processing massive graphs. The (semi) streaming model, where we assume that the space is (near) linear in the number of vertices (but not necessarily the edges) is an useful and efficient model for processing large graphs. In many of these graphs the numbers of vertices are significantly less t...
متن کامل